.TH "UNICORN" "3" "Jan 27th 2025" "Unicorn 1.0.4"
.SH NAME
uni_norm \- normalize text
.SH LIBRARY
Embeddable Unicode Algorithms (libunicorn, -lunicorn)
.SH SYNOPSIS
.nf
.B #include <unicorn.h>
.PP
.BI "unistat uni_norm(uninormform " form ", const void *" src ", unisize " src_len ", uniattr " src_attr ", void *" dst ", unisize *" dst_len ", uniattr " dst_attr ");"
.fi
.SH DESCRIPTION
Normalizes \f[I]src\f[R] into the normalization form specified by \f[I]form\f[R] and writes the result to \f[I]dst\f[R].
If \f[I]dst\f[R] is not \f[C]NULL\f[R], then the implementation writes to \f[I]dst_len\f[R] the total number of code units written to \f[I]dst\f[R].
If the capacity of \f[I]dst\f[R] is insufficient, then \f[B]UNI_NO_SPACE\f[R] is returned otherwise it returns \f[B]UNI_OK\f[R].
.PP
If \f[I]dst\f[R] is \f[C]NULL\f[R], then \f[I]dst_len\f[R] must be zero.
If \f[I]dst\f[R] is \f[C]NULL\f[R], then the function writes to \f[I]dst_len\f[R] the number of code units in the fully normalized text and returns \f[B]UNI_OK\f[R].
Call the function this way to first compute the total size needed for the destination buffer, then call it again with a sufficiently-sized buffer.
.SH RETURN VALUE
.TP
UNI_OK
On success.
.TP
UNI_BAD_OPERATION
If \f[I]src\f[R] is \f[C]NULL\f[R], if \f[I]dst_len\f[R] is negative, or if \f[I]dst\f[R] is NULL and \f[I]dst_len\f[R] is greater than zero.
.TP
UNI_BAD_ENCODING
If \f[I]src\f[R] is malformed; this is never returned if \f[I]src_attr\f[R] has \f[B]UNI_TRUST\f[R](3).
.TP
UNI_NO_SPACE
If \f[I]dst\f[R] lacks the capacity to store the normalization of \f[I]src\f[R].
.TP
UNI_NO_MEMORY
If dynamic memory allocation failed.
.TP
UNI_FEATURE_DISABLED
If Unicorn was built without support for normalizing to \f[I]form\f[R].
.SH EXAMPLES
This example normalizes the input string to Normalization Form D. This form is ideal for in-memory string comparison because it is quick to compute.
For persistent storage or transmission, Normalization Form C is preferred.
.PP
.in +4n
.EX
#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char *in = u8"Åström";
    char out[32];
    unisize outlen = sizeof(out);

    if (uni_norm(UNI_NFD, in, -1, UNI_UTF8,  out, &outlen, UNI_UTF8) != UNI_OK)
    {
        // something went wrong
        return 1;
    }

    printf("%.*s", outlen, out);
    return 0;
}
.EE
.in
.SH SEE ALSO
.BR unistat (3),
.BR UNI_TRUST (3),
.BR uninormform (3),
.BR unisize (3),
.BR uniattr (3)
.SH AUTHOR
.UR https://railgunlabs.com
Railgun Labs
.UE .
.SH INTERNET RESOURCES
The online documentation is published on the
.UR https://railgunlabs.com/unicorn
Railgun Labs website
.UE .
.SH LICENSING
Unicorn is distributed with its end-user license agreement (EULA).
Please review the agreement for information on terms & conditions for accessing or otherwise using Unicorn and for a DISCLAIMER OF ALL WARRANTIES.
